Software Cost Estimation with Incomplete Data

Authors

  • Kevin Strike
  • Khaled El Emam
  • Nazim H. Madhavji
Abstract

The construction of software cost estimation models remains an active topic of research. The basic premise of cost modelling is that a historical database of software project cost data can be used to develop a quantitative model to predict the cost of future projects. One of the difficulties faced by workers in this area is that many of these historical databases contain substantial amounts of missing data. Thus far, the common practice has been to ignore observations with missing data. In principle, such a practice can lead to gross biases, and may be detrimental to the accuracy of cost estimation models. In this paper we describe an extensive simulation where we evaluate different techniques for dealing with missing data in the context of software cost modelling. Three techniques are evaluated: listwise deletion, mean imputation and eight different types of hot-deck imputation. Our results indicate that all the missing data techniques perform well, with small biases and high precision. This suggests that the simplest technique, listwise deletion, is a reasonable choice. However, this will not necessarily provide the best performance. Consistent best performance (minimal bias and highest precision) can be obtained by using hot-deck imputation with Euclidean distance and a z-score standardisation.
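
The three families of techniques named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of `None` to mark a missing value, the choice of complete cases as the donor pool, and the use of the population standard deviation for the z-score are all assumptions made for illustration. The abstract does not describe the eight hot-deck variants in detail, so only one plausible variant (nearest donor by Euclidean distance on z-score standardised values) is shown.

```python
import math
from statistics import mean, pstdev


def listwise_deletion(rows):
    """Keep only rows that have no missing (None) values."""
    return [list(r) for r in rows if None not in r]


def column_stats(rows):
    """Per-column (mean, std) from observed values only.

    Assumes every column has at least one observed value.
    A zero std is replaced by 1.0 to avoid division by zero.
    """
    stats = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        sd = pstdev(observed)
        stats.append((mean(observed), sd if sd > 0 else 1.0))
    return stats


def mean_imputation(rows):
    """Replace each missing value with the mean of its column."""
    stats = column_stats(rows)
    return [[stats[j][0] if v is None else v for j, v in enumerate(r)]
            for r in rows]


def hot_deck_imputation(rows):
    """Fill missing values from the nearest complete row (the donor).

    Distance is Euclidean, computed on z-score standardised values over
    the columns that the incomplete row actually has.
    """
    stats = column_stats(rows)
    donors = listwise_deletion(rows)
    if not donors:
        raise ValueError("hot-deck imputation needs at least one complete row")

    def z(value, j):
        col_mean, col_sd = stats[j]
        return (value - col_mean) / col_sd

    result = []
    for r in rows:
        if None not in r:
            result.append(list(r))
            continue
        shared = [j for j, v in enumerate(r) if v is not None]

        def distance(donor):
            return math.sqrt(sum((z(r[j], j) - z(donor[j], j)) ** 2
                                 for j in shared))

        nearest = min(donors, key=distance)
        result.append([nearest[j] if v is None else v
                       for j, v in enumerate(r)])
    return result


# Hypothetical project records: [size in KLOC, team size, effort]
projects = [
    [10.0, 3.0, 20.0],
    [50.0, 8.0, 120.0],
    [12.0, 4.0, None],    # effort missing
    [48.0, None, 115.0],  # team size missing
]
completed = hot_deck_imputation(projects)
```

In this toy data set, listwise deletion keeps only the first two projects, mean imputation fills the gaps with column averages, and the hot-deck variant borrows the missing value from the most similar complete project. Standardising before measuring distance keeps a variable with a large numeric range, such as effort, from dominating the choice of donor.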

Similar resources

Presented a method for estimating the cost of software using PCA to reduce the size and with the help of data mining

  These days, data mining one of the most significant issues. One field data mining is a mixture of computer science and statistics which is considerably limited due to increase in digital data and growth of computational power of computer. One of the domains of data mining is the software cost estimation category. In this article, classifying techniques of learning algorithm of machine ...

A New Empirical Model to Increase the Accuracy of Software Cost Estimation (TECHNICAL NOTE)

We can say a software project is successful when it is delivered on time, within the budget and maintaining the required quality. However, nowadays software cost estimation is a critical issue for the advance software industry. As the modern software’s behaves dynamically so estimation of the effort and cost is significantly difficult. Since last 30 years, more than 20 models are already develo...

A Model-Driven Decision Support System for Software Cost Estimation (Case Study: Projects in NASA60 Dataset)

Estimating the costs of software development is one of the most important activities in software project management. Inaccuracies in such estimates may cause irreparable loss. A low estimate of the cost of projects will result in failure on delivery on time and indicates the inefficiency of the software development team. On the other hand, high estimates of resources and costs for a project wil...

Software Cost Estimation by a New Hybrid Model of Particle Swarm Optimization and K-Nearest Neighbor Algorithms

A successful software should be finalized with determined and predetermined cost and time. Software is a production which its approximate cost is expert workforce and professionals. The most important and approximate software cost estimation (SCE) is related to the trained workforce. Creative nature of software projects and its abstract nature make extremely cost and time of projects difficult ...

Analyzing Effort Estimation in Multistage based FL-COCOMO II Framework using various Fuzzy Membership Functions

Software development has always been characterized by some metrics. One of the greatest challenges for software developers lies in predicting the development effort for a software system which is based on developer abilities, size, complexity and other metrics. Several algorithmic cost estimation models such as Boehm’s COCOMO, Albrecht's' Function Point Analysis, Putnam’s SLIM, ESTIMACS etc. ar...

Journal title:
  • IEEE Trans. Software Eng.

Volume 27

Publication date: 2001